Abstract

This paper deals with a universal coding problem for a certain kind of multiterminal source coding network called a generalized 
complementary delivery network. In this network, messages from multiple correlated sources are jointly encoded, and each decoder 
has access to some of the messages to enable it to reproduce the other messages. Both fixed-to-fixed length and fixed-to-variable 
length lossless coding schemes are considered. Explicit constructions of universal codes and the bounds of the error probabilities 
are clarified by using methods of types and graph-theoretical analysis.

Index Terms 

multiterminal source coding, network source coding, correlated sources, universal coding, lossless coding, complementary 
delivery, vertex coloring, methods of types. 

I. Introduction 

A coding problem for correlated information sources was first described and investigated by Slepian and Wolf [1], and
later, various coding problems derived from that work were considered (e.g. Wyner [2], Körner and Marton [3], Sgarro [4]).
Meanwhile, the problem of universal coding for these systems was first investigated by Csiszár and Körner [5]. Universal
coding problems are not only interesting in their own right but are also very important in terms of practical applications.
Subsequent work has mainly focused on the Slepian-Wolf network [6], [7], [8] since it appears to be difficult to construct
universal codes for most of the other networks. For example, Muramatsu [9] showed that no fixed-to-variable length (FV)
universal code can attain the optimal coding rate for the Wyner-Ziv coding problem [10].

Our main contributions in this paper include showing explicit constructions of universal codes for other multiterminal source 
coding networks. Figs. 1 and 2 illustrate the scenario we are considering: Several stations are separately deployed in a field.
Every station collects its own target data from sensors or terminals, and wants to share all the target data with the other
stations. To accomplish this task, each station transmits the collected data to a satellite, and the satellite broadcasts all the
received data back to the stations. Each station utilizes its own target data as side information to reproduce all the other data.
Willems et al. [11], [12] investigated a special case of the above scenario in which three stations were deployed and each
station had access to one of three target messages, and they determined the minimum achievable rates for uplink (from each
station to the satellite) and downlink (from the satellite to all the stations) transmissions. Their main result implies that the
uplink transmission is equivalent to the traditional Slepian-Wolf coding system [1], and thus we should concentrate on the
downlink part. Henceforth we denote the networks characterized by the downlink transmission shown in Fig. 2 as generalized
complementary delivery networks (Fig. 3), and we denote the generalized complementary delivery network with two stations
and two target messages as the (original) complementary delivery network. This notation is based on the network structure
where each station (decoder) complements the target messages from the codeword delivered by the satellite (encoder).

The complementary delivery network can be regarded as a special example of the butterfly network [13], [14] (Fig. 4), which
is one of the best-known network structures that illustrate the benefits of network coding. If we assume that all the edges
in Fig. 4 except that between nodes 3 and 4 have sufficiently large capacities, the problem is to find the minimum capacity
of the edge between nodes 3 and 4 that allows two messages emitted from the source (node 0) to be delivered
to sinks 1 (node 5) and 2 (node 6). This situation is equivalent to the complementary delivery network in which the messages
emitted from the source node are correlated with each other. Several coding problems for correlated sources over a network
have recently been investigated. At first only one receiver was considered (e.g. [15], [16]), and later networks incorporating
multiple receivers were studied (e.g. [17], [18], [19], [20]). In particular, Ho et al. [18] and Kuzuoka et al. [20] applied the
linear Slepian-Wolf codes to random linear network coding over general 2-source multi-cast networks and universal source
coding for the complementary delivery network, respectively. However, explicit code constructions over networks with multiple
sources and multiple destinations still remain open.
Fig. 2. Data distribution: The satellite broadcasts the collected data back to the stations for sharing. Each station has already gathered its own target data,
and thus wants to reproduce the other data by using its own target data as side information. 



This paper proposes a universal coding scheme for generalized complementary delivery networks that involve multiple
sources and multiple destinations. First, an explicit construction of fixed-to-fixed length (FF) universal codes based on a
graph-theoretical analysis is presented. This construction utilizes a codebook expressed as a certain kind of undirected graph.
Encoding can be regarded as the vertex coloring of the graphs. The bounds of error probabilities and probabilities of correct 
decoding can be evaluated by methods of types. The proposed coding scheme can always attain the optimal error exponent 
(the exponent of error probabilities), and can attain the optimal exponent of correct decoding in some cases. This FF coding 
scheme can be applied to fixed-to-variable length (FV) universal codes. Overflow and underflow probabilities are evaluated in 
almost the same way as the error probabilities and the probabilities of correct decoding, respectively. 

This paper is organized as follows: Notations and definitions are provided in Section II. A generic formulation of the
generalized complementary delivery coding system is introduced in Section III. A coding scheme for FF universal codes is
proposed in Section IV. Several coding theorems for FF universal codes are clarified in Section V. Lastly, FV universal coding
is discussed in Section VI.



II. Preliminaries 

A. Basic definitions 

Let $\mathcal{B}$ be a binary set, $\mathcal{B}^*$ be the set of all finite sequences in the set $\mathcal{B}$ and $\mathcal{I}_M = \{1, 2, \cdots, M\}$ for an integer $M$. In
what follows, random variables are denoted by capital letters such as $X$, and their sample values (resp. alphabets) by the
corresponding small letters (resp. calligraphic letters) such as $x$ (resp. $\mathcal{X}$), except as otherwise noted. The cardinality of a finite
set $\mathcal{X}$ is written as $|\mathcal{X}|$, and the $n$-th Cartesian product of $\mathcal{X}$ by $\mathcal{X}^n$. A member of $\mathcal{X}^n$ is written as

\[ x^n = (x_1, x_2, \cdots, x_n), \]

and substrings of $x^n$ are written as

\[ x_i^j = (x_i, x_{i+1}, \cdots, x_j), \quad i \le j. \]

When the dimension is clear from the context, vectors will be denoted by boldface letters, i.e., $\boldsymbol{x} \in \mathcal{X}^n$.

Fig. 3. Generalized complementary delivery network

Fig. 4. Butterfly network

The probability distribution for a random variable $X$ is denoted by $P_X$. Similarly, the probability distribution for random
variables $(X, Y)$ is denoted by $P_{XY}$, and the conditional distribution of $X$ given $Y$ is written as $P_{X|Y}$. The set of all probability
distributions on $\mathcal{X}$ is written as $\mathcal{P}(\mathcal{X})$, and the set of all conditional distributions on $\mathcal{X}$ given a distribution $P_Y \in \mathcal{P}(\mathcal{Y})$ is written
as $\mathcal{P}(\mathcal{X}|P_Y)$, which means that each member $P_{X|Y}$ of $\mathcal{P}(\mathcal{X}|P_Y)$ is characterized by $P_{XY} \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ as $P_{XY} = P_{X|Y} P_Y$.
A discrete memoryless source (DMS) is an infinite sequence of independent copies of a random variable $X$. The alphabet of
a DMS is assumed to be a finite set except as otherwise noted. For simplicity, we denote a source $(\mathcal{X}, P_X)$ by referring to its
generic distribution $P_X$ or random variable $X$. A set

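\[ X = (X^{(1)}, X^{(2)}, \cdots, X^{(N_s)}) \]

of $N_s$ random variables is also called a DMS, where each random variable $X^{(i)}$ takes a value in a finite set $\mathcal{X}^{(i)}$ ($i \in \mathcal{I}_{N_s}$).
For a set $S \subseteq \mathcal{I}_{N_s}$, the corresponding subset of sources is written as

\[ X^{(S)} \stackrel{\text{def.}}{=} \{X^{(i)} \mid i \in S\}, \]

and the corresponding alphabet (resp. subset of sample values) is denoted by

\[ \mathcal{X}^{(S)} \stackrel{\text{def.}}{=} \prod_{i \in S} \mathcal{X}^{(i)}, \]

\[ x^{(S)} \stackrel{\text{def.}}{=} \{x^{(i)} \in \mathcal{X}^{(i)} \mid i \in S\}. \]

For a set $S \subseteq \mathcal{I}_{N_s}$, the $n$-th Cartesian product of $\mathcal{X}^{(S)}$, its member and the corresponding random variable are written as
$\mathcal{X}^{(S)n}$, $x^{(S)n}$ and $X^{(S)n}$, respectively. With $S = \mathcal{I}_{N_s}$, we denote $X^{(S)n} = X^n$. For a set $S \subseteq \mathcal{I}_{N_s}$, its complement is denoted
as $S^c = \mathcal{I}_{N_s} - S$.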
For a DMS $X$ and finite sets $S_1, S_2 \subseteq \mathcal{I}_{N_s}$ that satisfy $S_1 \cap S_2 = \emptyset$, the joint entropy of $X^{(S_1)}$ and the conditional entropy
of $X^{(S_2)}$ given $X^{(S_1)}$ are written as $H(X^{(S_1)})$ and $H(X^{(S_2)}|X^{(S_1)})$, respectively (cf. [21]). For a generic distribution
$P \in \mathcal{P}(\mathcal{X}^{(S_1)})$ and a conditional distribution $W \in \mathcal{P}(\mathcal{X}^{(S_2)}|P)$, $H(P)$ and $H(W|P)$ also represent the joint entropy of
$X^{(S_1)}$ and the conditional entropy of $X^{(S_2)}$ given $X^{(S_1)}$, where $P = P_{X^{(S_1)}}$ and $W = P_{X^{(S_2)}|X^{(S_1)}}$. The Kullback-Leibler
divergence, or simply the divergence, between two distributions $P$ and $Q$ is written as $D(P\|Q)$.

In the following, all bases of exponentials and logarithms are set at 2. 



B. Types of sequences 

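Let us define the type of a sequence $x \in \mathcal{X}^n$ as the empirical distribution $Q_x \in \mathcal{P}(\mathcal{X})$ of the sequence $x$, i.e.

\[ Q_x(a) \stackrel{\text{def.}}{=} \frac{1}{n} N(a|x) \quad \forall a \in \mathcal{X}, \]

where $N(a|x)$ represents the number of occurrences of the letter $a$ in the sequence $x$. Similarly, the joint type $Q_{x^{(S)}} \in \mathcal{P}(\mathcal{X}^{(S)})$
for a given set $S \subseteq \mathcal{I}_{N_s}$ is defined by

\[ Q_{x^{(S)}}(a_{i_1}, a_{i_2}, \cdots, a_{i_{|S|}}) \stackrel{\text{def.}}{=} \frac{1}{n} N(a_{i_1}, a_{i_2}, \cdots, a_{i_{|S|}} | x^{(S)}) \quad \forall (a_{i_1}, a_{i_2}, \cdots, a_{i_{|S|}}) \in \mathcal{X}^{(S)}. \]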

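Let $\mathcal{P}_n(\mathcal{X})$ be the set of types of sequences in $\mathcal{X}^n$. Similarly, for every type $Q \in \mathcal{P}_n(\mathcal{X})$, let $\mathcal{V}_n(\mathcal{Y}|Q)$ be the set of all
stochastic matrices $V : \mathcal{X} \to \mathcal{Y}$ such that for some pairs $(x, y) \in \mathcal{X}^n \times \mathcal{Y}^n$ of sequences we have

\[ Q_{x,y}(x, y) = Q(x)V(y|x) = \prod_{i=1}^{n} Q(x_i)V(y_i|x_i). \]

For every type $Q \in \mathcal{P}_n(\mathcal{X})$ we denote

\[ T_Q^n \stackrel{\text{def.}}{=} \{x \in \mathcal{X}^n \mid Q_x = Q\}. \]

Similarly, for every sequence $x \in T_Q^n$ and stochastic matrix $V \in \mathcal{V}_n(\mathcal{Y}|Q)$, we define a V-shell as

\[ T_V^n(x) \stackrel{\text{def.}}{=} \{y \in \mathcal{Y}^n \mid Q(x)V(y|x) = Q_{x,y}(x, y), \ \forall (x, y) \in \mathcal{X} \times \mathcal{Y}\}. \]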

Here, let us introduce several important properties of types. 
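Lemma 1: (Type counting lemma [21, Lemma 2.2])

\[ |\mathcal{P}_n(\mathcal{X})| \le (n+1)^{|\mathcal{X}|}. \]

Lemma 2: (Sizes of V-shells [21, Lemma 2.5])
For every type $Q \in \mathcal{P}_n(\mathcal{X})$, sequence $x \in T_Q^n$ and stochastic matrix $V : \mathcal{X} \to \mathcal{Y}$ such that $T_V^n(x) \neq \emptyset$, we have

\[ |T_V^n(x)| \ge (n+1)^{-|\mathcal{X}||\mathcal{Y}|} \exp\{nH(V|Q)\}, \]
\[ |T_V^n(x)| \le \exp\{nH(V|Q)\}. \]

Lemma 3: (Probabilities of types [21, Lemma 2.6])
For every type $Q \in \mathcal{P}_n(\mathcal{X})$ and every distribution $P_X \in \mathcal{P}(\mathcal{X})$, we have

\[ P_X(x) = \exp\{-n(D(Q\|P_X) + H(Q))\} \quad \forall x \in T_Q^n, \]
\[ P_X(T_Q^n) \ge (n+1)^{-|\mathcal{X}|} \exp\{-nD(Q\|P_X)\}, \]
\[ P_X(T_Q^n) \le \exp\{-nD(Q\|P_X)\}. \]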
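As a small numerical illustration of the definitions above, the following sketch computes the type of a short binary sequence and checks the first identity of Lemma 3 and the upper bound on the size of a type class. The function names (`type_of`, `entropy`, `divergence`, `sequence_probability`) and the example distribution are illustrative assumptions and do not appear in the paper.

```python
import math
from collections import Counter

def type_of(x, alphabet):
    """Empirical distribution (type) Q_x of the sequence x."""
    n = len(x)
    counts = Counter(x)
    return {a: counts[a] / n for a in alphabet}

def entropy(q):
    """Entropy H(Q) in bits (all logarithms are to base 2)."""
    return -sum(p * math.log2(p) for p in q.values() if p > 0)

def divergence(q, p):
    """Divergence D(Q||P) in bits; assumes P(a) > 0 wherever Q(a) > 0."""
    return sum(qa * math.log2(qa / p[a]) for a, qa in q.items() if qa > 0)

def sequence_probability(x, p):
    """Probability of the sequence x under the memoryless source P."""
    return math.prod(p[a] for a in x)

alphabet = (0, 1)
p = {0: 0.6, 1: 0.4}            # hypothetical source distribution
x = (0, 0, 1, 0, 1, 0, 0, 0)    # n = 8 with two ones

n = len(x)
q = type_of(x, alphabet)

# Lemma 3, first identity: P(x) = 2^{-n (D(Q||P) + H(Q))} for x in the type class of Q.
lhs = sequence_probability(x, p)
rhs = 2 ** (-n * (divergence(q, p) + entropy(q)))

# The type class of Q consists of all length-8 binary sequences with two ones.
type_class_size = math.comb(n, 2)
upper_bound = 2 ** (n * entropy(q))
```

Here `lhs` and `rhs` agree up to floating-point error, and `type_class_size` stays below `upper_bound`, as the standard type-class bound requires.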
C. Graph coloring

Let us introduce several notations and lemmas related to graph coloring. An (undirected) graph is denoted as $G = (V_G, E_G)$,
where $V_G$ is a set of vertices and $E_G$ is a set of edges. The degree $\Delta(v)$ of a vertex $v \in V_G$ is the number of other vertices
connected to $v$ by edges, and the degree $\Delta(G)$ of a graph $G$ is the maximum degree of the vertices in $G$. A graph
where an edge connects every pair of vertices is called a complete graph. A complete subgraph is called a clique, and the
largest number of vertices in a clique of a graph $G$ is called the clique number $\omega(G)$ of the graph $G$. A vertex coloring, or simply a
coloring, of a graph $G$ is an assignment of symbols to vertices in which no two adjacent vertices are assigned the same symbol. The minimum number of symbols necessary for
a vertex coloring of a graph is called the chromatic number $\chi(G)$. Similarly, an edge coloring of a graph $G$ is an assignment of symbols to edges in which no
two adjacent edges are assigned the same symbol, and the minimum number of symbols necessary for an edge coloring is called the edge
chromatic number $\chi'(G)$.

The following lemmas are well known as bounds of the chromatic number and the edge chromatic number. 
Lemma 4: (Brooks [22], [23])

\[ \omega(G) \le \chi(G) \le \Delta(G) \]

unless $G$ is a complete graph or an odd cycle (a cycle graph that contains an odd number of vertices).
Lemma 5: (Vizing [24], [23])

\[ \Delta(G) \le \chi'(G) \le \Delta(G) + 1. \]

Lemma 6: (König [25], [23])
If a graph $G$ is bipartite, then

\[ \chi'(G) = \Delta(G). \]
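The quantities appearing in these lemmas can be checked on a toy graph. The sketch below is only an illustration under stated assumptions: it uses a plain greedy coloring, which guarantees at most $\Delta(G) + 1$ symbols rather than the sharper bound of Lemma 4, and a brute-force clique search that is practical only for very small graphs. The helper names and the example graph are hypothetical.

```python
from itertools import combinations

def max_degree(adj):
    """Degree of the graph: the largest vertex degree."""
    return max(len(neighbours) for neighbours in adj.values())

def clique_number(adj):
    """Brute-force clique number; only practical for tiny graphs."""
    vertices = list(adj)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(v in adj[u] for u, v in combinations(subset, 2)):
                return size
    return 0

def greedy_coloring(adj):
    """Give each vertex the smallest symbol not used by an already colored neighbour."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Hypothetical example: a triangle a-b-c with a path c-d-e attached.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

color = greedy_coloring(adj)
num_colors = len(set(color.values()))
omega = clique_number(adj)
delta = max_degree(adj)
```

For this graph the greedy coloring uses as many symbols as the clique number, so it is optimal here; in general a greedy coloring only satisfies `omega <= num_colors <= delta + 1`.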



III. Problem formulation

This section formulates the coding problem investigated in this paper, and shows the fundamental bound of the coding rate. 

First, we describe a generalized complementary delivery network. Fig. 3 represents the network formulated below. This
network is composed of $N_s$ sources $X = (X^{(1)}, \cdots, X^{(N_s)})$, one encoder $\varphi_n$ and $N_d$ decoders $\hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)}$. Each decoder $\hat{\varphi}_n^{(j)}$ has
access to side information $X^{(S_j^c)}$ ($S_j \subseteq \mathcal{I}_{N_s}$) to enable it to reproduce the information $X^{(S_j)}$. Since the indices $S = \{S_j\}_{j=1}^{N_d}$
of side information determine the network, henceforth we denote the network by $S$. Without loss of generality, we assume

\[ S_{j_1} \neq S_{j_2} \quad \forall j_1 \neq j_2 \in \mathcal{I}_{N_d}. \]

Based on the above definition of the network, we formulate the coding problem for the network. 

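Definition 1: (Fixed-to-fixed generalized complementary delivery (FF-GCD) code)
A sequence

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

of codes

\[ (\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)}) \]

is an FF-GCD code for the network $S = \{S_j\}_{j=1}^{N_d}$ if

\[ \varphi_n : \mathcal{X}^{(\mathcal{I}_{N_s})n} \to \mathcal{I}_{M_n}, \qquad \hat{\varphi}_n^{(j)} : \mathcal{I}_{M_n} \times \mathcal{X}^{(S_j^c)n} \to \mathcal{X}^{(S_j)n} \quad \forall j \in \mathcal{I}_{N_d}. \]

Definition 2: (FF-GCD achievable rate)
$R$ is an FF-GCD achievable rate of the source $X$ for the network $S$ if and only if there exists an FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ that satisfies

\[ \limsup_{n \to \infty} \frac{1}{n} \log M_n \le R, \]

\[ \lim_{n \to \infty} e_n^{(j)} = 0 \quad \forall j \in \mathcal{I}_{N_d}, \]

where

\[ e_n^{(j)} \stackrel{\text{def.}}{=} \Pr\{X^{(S_j)n} \neq \hat{X}^{(S_j)n}\} \quad \forall j \in \mathcal{I}_{N_d}, \]

\[ \hat{X}^{(S_j)n} \stackrel{\text{def.}}{=} \hat{\varphi}_n^{(j)}(\varphi_n(X^n), X^{(S_j^c)n}). \]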
Fig. 5. Network investigated by Willems et al.



Definition 3: (Inf FF-GCD achievable rate)

\[ R_f(X|S) = \inf\{R \mid R \text{ is an FF-GCD achievable rate of } X \text{ for } S\}. \]

Willems et al. [11], [12] clarified the minimum achievable rate $R_f(X|S)$ for a special case, where $N_s = N_d = 3$,
$(X^{(1)}, X^{(2)}, X^{(3)}) = (X, Y, Z)$, $S_1 = \{1, 2\}$, $S_2 = \{1, 3\}$ and $S_3 = \{2, 3\}$ (Fig. 5).

Theorem 1: (Coding theorem of FF-GCD codes for three users [12])
If $N_s = N_d = 3$, $(X^{(1)}, X^{(2)}, X^{(3)}) = (X, Y, Z)$, $S_1 = \{1, 2\}$, $S_2 = \{1, 3\}$ and $S_3 = \{2, 3\}$, then

\[ R_f(X, Y, Z|S) = \max\{H(X, Y|Z), H(Y, Z|X), H(X, Z|Y)\}. \]

It is easy to extend Theorem 1 to the following coding theorem for general cases:
Theorem 2: (Coding theorem of FF-GCD codes for general cases)

\[ R_f(X|S) = \max_{j \in \mathcal{I}_{N_d}} H(X^{(S_j)}|X^{(S_j^c)}). \]

Remark 1: The generalized complementary delivery network is included in the framework considered by Csiszár and Körner
[5]. Therefore, Theorem 2 can be obtained as a corollary of their results.
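The rate in the coding theorems above can be evaluated directly for a small joint distribution. The following sketch assumes, consistently with the three-user case of Theorem 1, that each set in the network indexes the messages a decoder reproduces and that the remaining messages are its side information. Sources are indexed from 0 in the code, and the example source (two independent fair bits and their XOR) and all function names are hypothetical.

```python
import math
from itertools import product

def marginal(pmf, idx):
    """Marginal distribution of the coordinates listed in idx."""
    out = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in idx)
        out[key] = out.get(key, 0.0) + prob
    return out

def entropy(pmf):
    """Entropy in bits."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def conditional_entropy(pmf, target, given):
    """H(X^(target) | X^(given)) = H(X^(target), X^(given)) - H(X^(given))."""
    both = sorted(set(target) | set(given))
    return entropy(marginal(pmf, both)) - entropy(marginal(pmf, sorted(given)))

def min_rate(pmf, num_sources, network):
    """Largest conditional entropy of the reproduced messages given the side information."""
    everything = set(range(num_sources))
    return max(conditional_entropy(pmf, s, everything - set(s)) for s in network)

# Hypothetical source: X and Y are independent fair bits and Z = X xor Y.
pmf = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

# Three-user network of Theorem 1, written with 0-based source indices.
network = [{0, 1}, {0, 2}, {1, 2}]
rate = min_rate(pmf, 3, network)
```

For this source any two of the three messages determine the third, so each conditional entropy equals one bit and `rate` evaluates to 1, whereas sending all messages without exploiting the side information would require two bits per symbol.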

IV. Code construction 

This section shows an explicit construction of universal codes for the generalized complementary delivery network. The 
proposed universal coding scheme is described as follows: 

[Encoding] 

1) Determine a set $\mathcal{T}_n(R) \subseteq \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})$ of joint types as

\[ \mathcal{T}_n(R) = \{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})}) : \max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) \le R, \ Q_X = Q_j V_j, \ \forall j \in \mathcal{I}_{N_d}\}, \]

where $R > 0$ is a given coding rate. We note that the joint type $Q_X$ and the system $S$ specify the type $Q_j$ and the
conditional type $V_j$ for every $j \in \mathcal{I}_{N_d}$.

2) Create a graph for every joint type $Q_X \in \mathcal{T}_n(R)$. An intuitive example of coding graphs is shown in Figs. 6, 7, 8 and 9,
where the network shown in Fig. 5 is considered. Each vertex of the graph corresponds to a sequence set $x^{(\mathcal{I}_{N_s})} \in T_{Q_X}^n$
(cf. Fig. 6). Henceforth we denote a vertex by referring to the corresponding sequence set $x^{(\mathcal{I}_{N_s})}$. An edge is placed
between vertices $x_1^{(\mathcal{I}_{N_s})}$ and $x_2^{(\mathcal{I}_{N_s})}$ if and only if $x_1^{(S_j^c)} = x_2^{(S_j^c)}$, $x_1^{(S_j)} \in T_{V_j}^n(x_1^{(S_j^c)})$ and $x_2^{(S_j)} \in T_{V_j}^n(x_2^{(S_j^c)})$ for some
$j \in \mathcal{I}_{N_d}$ (cf. Figs. 7 and 8). In the following, we call this graph the coding graph $G(Q_X)$. Note that Figs. 8 and 9 show
only a subgraph that corresponds to V-shells $T_{V_j}^n(x^{(S_j^c)})$, where $x^{(S_3^c)} = x_1$, $x^{(S_2^c)} = y_2$ and $x^{(S_1^c)} = z_4$.
Fig. 6. (Upper left) Intuitive example of coding graph. Each node corresponds to a sequence set $(x_i, y_j, z_k) \in T_{Q_{XYZ}}^n$.

Fig. 7. (Upper right) For a given $x_1$, an edge is placed between every pair of vertices whose subsequences satisfy $(y_j, z_k) \in T_{V_3}^n(x_1)$, which means that
for a given $x_1$ we must distinguish each $(y_j, z_k)$ such that $(x_1, y_j, z_k) \in T_{Q_{XYZ}}^n$.

Fig. 8. (Lower left) In a similar manner, for a given $y_2$ (resp. $z_4$) an edge is placed between every pair of vertices whose subsequences satisfy
$(x_i, z_k) \in T_{V_2}^n(y_2)$ (resp. $(x_i, y_j) \in T_{V_1}^n(z_4)$).

Fig. 9. (Lower right) Example of codeword assignment. Assigning a codeword to each sequence set can be regarded as vertex coloring of the coding graph.

3) Assign a symbol to each vertex of the coding graph $G(Q_X)$ so that the same symbol is not assigned to any pair of
adjacent vertices (cf. Fig. 9).

4) For an input sequence set $x^{(\mathcal{I}_{N_s})}$ whose joint type $Q_X$ is a member of $\mathcal{T}_n(R)$, the index assigned to the joint type $Q_X$
is the first part of the codeword, and the symbol assigned to the corresponding vertex of the coding graph is determined
as the second part of the codeword. For a sequence set $x^{(\mathcal{I}_{N_s})}$ whose joint type $Q_X$ is not a member of $\mathcal{T}_n(R)$, the
codeword is determined arbitrarily and an encoding error is declared.

[Decoding: $\hat{\varphi}_n^{(j)}$]

1) The first part of the received codeword represents the joint type $\hat{Q}_X$ of the input sequence. If no encoding error occurs,
then $\hat{Q}_X$ should be $Q_X$, and therefore the decoder $\hat{\varphi}_n^{(j)}$ can find the coding graph $G(\hat{Q}_X) = G(Q_X)$ used in the
encoding scheme.

2) For given side information $x^{(S_j^c)}$ and the joint type $\hat{Q}_X$, find the vertex $\hat{x}^{(\mathcal{I}_{N_s})}$ such that (i) $\hat{x}^{(S_j^c)} = x^{(S_j^c)}$ and (ii) the
second part of the received codeword is assigned to $\hat{x}^{(\mathcal{I}_{N_s})}$. Such a vertex is found in the clique that corresponds to the
set $T_{V_j}^n(x^{(S_j^c)})$. With Fig. 9, if $x_1$ is given as a side information sequence, we can find such a vertex from the
upper left clique. Note that the conditional type $V_j$ has been determined by $\hat{Q}_X = Q_X$. The sequence set $\hat{x}^{(S_j)} \in T_{V_j}^n(x^{(S_j^c)})$
found in this step is reproduced.

It should be noted that the above coding scheme is universal since it does not depend on the distribution $P_X$ of a source
$X$.
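The encoding and decoding steps above can be traced on a toy instance. The sketch below is a minimal illustration under several assumptions: it handles one joint type only, it uses a greedy coloring rather than an optimal one, each set in `network` lists the (0-based) sources a decoder reproduces while the remaining sources are its side information, and every function name is hypothetical.

```python
from collections import Counter
from itertools import product

def joint_type(vertex):
    """Joint type of a sequence set, stored as sorted (symbol tuple, count) pairs."""
    return tuple(sorted(Counter(zip(*vertex)).items()))

def type_class(alphabets, n, target):
    """All sequence sets of length n whose joint type equals target."""
    per_source = [list(product(a, repeat=n)) for a in alphabets]
    return [v for v in product(*per_source) if joint_type(v) == target]

def side_info(vertex, reproduce):
    """Sequences available to a decoder that must reproduce the sources in `reproduce`."""
    return tuple(seq for i, seq in enumerate(vertex) if i not in reproduce)

def build_coding_graph(vertices, network):
    """Join two vertices whenever some decoder sees the same side information for both."""
    adj = {v: set() for v in vertices}
    for u in vertices:
        for v in vertices:
            if u != v and any(side_info(u, s) == side_info(v, s) for s in network):
                adj[u].add(v)
    return adj

def greedy_coloring(adj):
    """Assign each vertex the smallest symbol unused by its colored neighbours."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color

def decode(color, symbol, known, reproduce):
    """Find the unique vertex with the received symbol that matches the side information."""
    matches = [v for v, c in color.items()
               if c == symbol and side_info(v, reproduce) == known]
    assert len(matches) == 1
    return matches[0]

# Two binary sources, n = 3, complementary delivery: each decoder knows one
# sequence and reproduces the other.
alphabets = [(0, 1), (0, 1)]
network = [{0}, {1}]
source = ((0, 0, 1), (0, 1, 1))

vertices = type_class(alphabets, 3, joint_type(source))
adj = build_coding_graph(vertices, network)
color = greedy_coloring(adj)

# Decoder for network[0] knows the second sequence and receives the vertex symbol.
recovered = decode(color, color[source], side_info(source, network[0]), network[0])
```

Two vertices that share a decoder's side information are adjacent and therefore carry different symbols, which is why the lookup in `decode` always returns exactly one vertex.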

The coding rate of the above proposed coding scheme is determined by the chromatic number of the coding graph $G(Q_X)$.
To this end, we introduce the following lemmas.

Lemma 7: The coding graph $G(Q)$ of the joint type $Q = Q_X$ has the following properties:
1) Every vertex set

\[ T_{V_j}^n(x^{(S_j^c)}) \]

comprises a clique, where $x^{(S_j^c)} \in T_{Q_j}^n$ and $j \in \mathcal{I}_{N_d}$.

2) Every vertex $x^{(\mathcal{I}_{N_s})} \in T_Q^n$ belongs to $N_d$ cliques, each of which corresponds to the vertex set $T_{V_j}^n(x^{(S_j^c)})$ ($j \in \mathcal{I}_{N_d}$).

3) The vertex $x^{(\mathcal{I}_{N_s})} \in T_Q^n$ has no edges from vertices not included in the vertex sets $\bigcup_{j \in \mathcal{I}_{N_d}} T_{V_j}^n(x^{(S_j^c)})$.

4) For a given joint type $Q \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})$, both the clique number $\omega(G(Q))$ and the degree $\Delta(G(Q))$ of the coding graph
$G(Q)$ are constant and obtained as follows:

\[ \omega(G(Q)) = \max_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})|, \]

\[ \Delta(G(Q)) = \sum_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})|. \]

Proof: 1) 2) 3) Easily obtained from the first and second steps of the above encoding scheme. 4) Easily obtained from
the above properties. ■
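Lemma 8: The chromatic number of the coding graph $G(Q)$ of the joint type $Q \in \mathcal{T}_n(R)$ is bounded as

\[ \chi(G(Q)) \le N_d \exp(nR). \]

Proof: This property is directly derived from Lemmas 2, 4 and 7 as follows:

\[
\begin{aligned}
\chi(G(Q)) &\le \Delta(G(Q)) && (1) \\
&= \sum_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})| && (2) \\
&\le \sum_{j \in \mathcal{I}_{N_d}} \exp\{nH(V_j|Q_j)\} && (3) \\
&\le N_d \exp\{n \max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j)\} \\
&\le N_d \exp(nR), && (4)
\end{aligned}
\]

where Eq. (1) comes from Lemma 4, Eq. (2) from Lemma 7, Eq. (3) from Lemma 2 and Eq. (4) from the definition of $\mathcal{T}_n(R)$.
This concludes the proof of Lemma 8. ■

From the above discussions, we obtain

\[ \omega(G(Q)) \le \chi(G(Q)) \le \Delta(G(Q)) \le N_d \exp(nR). \]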

V. Coding theorems 

A. General cases 

We show several coding theorems derived from the proposed coding scheme. Before showing these coding theorems, let us 
define the following function: 

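\[ \epsilon_n(N) \stackrel{\text{def.}}{=} \frac{1}{n}\{|\mathcal{X}^{(\mathcal{I}_{N_s})}| \log(n+1) + \log N\} \quad (5) \]

\[ \to 0 \quad (n \to \infty). \]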
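To give a feeling for how quickly this overhead term vanishes, the following sketch evaluates it for a hypothetical setting with three binary sources (so the joint alphabet has 8 letters) and $N = 3$. The function name `epsilon` and the chosen block lengths are illustrative assumptions.

```python
import math

def epsilon(n, alphabet_size, N):
    """Overhead term (|X| log2(n + 1) + log2 N) / n, with base-2 logarithms."""
    return (alphabet_size * math.log2(n + 1) + math.log2(N)) / n

block_lengths = (10, 100, 1000, 10000)
values = [epsilon(n, alphabet_size=8, N=3) for n in block_lengths]

# By construction, 2^{n * epsilon_n(N)} equals N (n + 1)^{|X|}, which is the
# polynomial factor absorbed into the exponents of the coding theorems.
n = 10
polynomial_factor = 2 ** (n * epsilon(n, alphabet_size=8, N=3))
```

The values decrease monotonically over these block lengths, which reflects that the type index and the factor $N$ cost only a logarithmic number of bits per block.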

First we present the direct part of the coding theorem for the universal FF-GCD codes, which implies that the coding scheme 
shown in Section IV attains the minimum achievable rate.

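Theorem 3: For a given real number $R > 0$, there exists a universal FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ such that for any integer $n \ge 1$ and any source $X$

\[ \frac{1}{n} \log M_n \le R + \epsilon_n(N_d), \quad (6) \]

\[ \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} \le \exp\left\{-n\left(-\epsilon_n(N_d) + \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X)\right)\right\}. \quad (7) \]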
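Proof: Note that a codeword is composed of two parts: the first part corresponds to the joint type of an input sequence
set, and the second part represents a symbol assigned to the input sequence set in the coding graph of the joint type. Therefore,
the size of the codeword set is bounded as

\[
\begin{aligned}
M_n &\le |\mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})| \cdot N_d \exp(nR) \\
&\le N_d (n+1)^{|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp(nR), \quad \text{(Lemma 1)}
\end{aligned}
\]

which implies Eq. (6). Next, we evaluate decoding error probabilities. Since every sequence set $x^{(\mathcal{I}_{N_s})n}$ whose joint type is a
member of $\mathcal{T}_n(R)$ is reproduced correctly at the decoder, the sum of the error probabilities is bounded as

\[
\begin{aligned}
\sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} &\le N_d \Pr\{X^n \in T_{Q_X}^n : Q_X \in \mathcal{T}_n^c(R)\} \\
&\le N_d \sum_{Q_X \in \mathcal{T}_n^c(R)} \exp\{-nD(Q_X\|P_X)\} && (8) \\
&\le N_d \sum_{Q_X \in \mathcal{T}_n^c(R)} \exp\{-n \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X)\} \\
&\le N_d (n+1)^{|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\{-n \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X)\} && (9) \\
&= \exp\left\{-n\left(-\epsilon_n(N_d) + \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X)\right)\right\},
\end{aligned}
\]

where Eq. (8) comes from Lemma 3 and Eq. (9) from Lemma 1. This completes the proof of Theorem 3. ■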
We can see that for any real value $R > R_f(X|S)$ we have

\[ \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X) > 0. \]

This implies that if $R > R_f(X|S)$ there exists an FF-GCD code for the network $S$ that universally attains the conditions
shown in Definition 2.

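The following converse theorem indicates that the error exponent obtained in Theorem 3 is tight.
Theorem 4: Any FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the system $S$ must satisfy

\[ \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} \ge \exp\left\{-n\left(\epsilon_n(2) + \min_{Q_X \in \mathcal{T}_n^c(R + \epsilon_n(2))} D(Q_X\|P_X)\right)\right\} \]

for any integer $n \ge 1$, any source $X$ and a given coding rate $R = (1/n) \log M_n > 0$.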

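Proof: Note that the number of sequences to be decoded correctly for each decoder is at most $\exp(nR)$. Here, let us
consider a joint type $Q_X \in \mathcal{T}_n^c(R + \epsilon_n(2))$. The definition of $\mathcal{T}_n^c(R + \epsilon_n(2))$ and Lemma 2 imply that for $x^{(\mathcal{I}_{N_s})} \in T_{Q_X}^n$ we
have

\[
\begin{aligned}
\max_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})| &\ge (n+1)^{-|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \max_{j \in \mathcal{I}_{N_d}} \exp\{nH(V_j|Q_j)\} && (10) \\
&\ge (n+1)^{-|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\{n(R + \epsilon_n(2))\} && (11) \\
&= 2\exp(nR),
\end{aligned}
\]

where Eq. (10) comes from Lemma 2 and Eq. (11) from the definition of $\mathcal{T}_n^c(R + \epsilon_n(2))$. Therefore, at least half of the
sequence sets in $T_{Q_X}^n$ will not be decoded correctly at the decoder $\hat{\varphi}_n^{(j)}$. Thus, the sum of the error probabilities is bounded as

\[
\begin{aligned}
\sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} &\ge \sum_{Q_X \in \mathcal{T}_n^c(R + \epsilon_n(2))} \frac{1}{2} \Pr\{X^n \in T_{Q_X}^n\} \\
&\ge \frac{1}{2}(n+1)^{-|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\left\{-n \min_{Q_X \in \mathcal{T}_n^c(R + \epsilon_n(2))} D(Q_X\|P_X)\right\} && (12) \\
&= \exp\left\{-n\left(\epsilon_n(2) + \min_{Q_X \in \mathcal{T}_n^c(R + \epsilon_n(2))} D(Q_X\|P_X)\right)\right\},
\end{aligned}
\]

where Eq. (12) comes from Lemma 3. This concludes the proof of Theorem 4. ■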

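The following corollary is directly derived from Theorems 3 and 4. This shows the asymptotic optimality of the proposed
coding scheme.

Corollary 1: For a given real number $R > 0$, there exists a universal FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ such that for any source $X$

\[ \limsup_{n \to \infty} \frac{1}{n} \log M_n \le R, \]

\[ \lim_{n \to \infty} -\frac{1}{n} \log \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} = \min_{Q_X \in \mathcal{T}^c(R)} D(Q_X\|P_X), \]

where

\[ \mathcal{T}(R) = \{Q_X \in \mathcal{P}(\mathcal{X}^{(\mathcal{I}_{N_s})}) : \max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) \le R, \ Q_X = Q_j V_j, \ \forall j \in \mathcal{I}_{N_d}\}. \]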

In a similar manner, we can obtain a probability such that the original sequence set is correctly reproduced. The following 
theorem shows the lower bound of the probability of correct decoding that can be achieved by the proposed coding scheme. 
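Theorem 5: For a given real number $R > 0$, there exists a universal FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ such that for any integer $n \ge 1$ and any source $X$

\[ \frac{1}{n} \log M_n \le R + \epsilon_n(N_d), \quad (13) \]

\[ 1 - \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} \ge \exp\left\{-n\left(\epsilon_n(1) + \min_{Q_X \in \mathcal{T}_n(R)} D(Q_X\|P_X)\right)\right\}. \]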

Proof: Eq. (13) is derived in the same way as the proof of Theorem 3. Next, we evaluate the probability such that the
original sequence set is correctly reproduced. Since every sequence set $x^{(\mathcal{I}_{N_s})n}$ whose joint type is a member of $\mathcal{T}_n(R)$ is
reproduced correctly at the decoder, the sum of the probabilities is bounded as

\[
\begin{aligned}
1 - \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} &\ge \Pr\{X^n \in T_{Q_X}^n : Q_X \in \mathcal{T}_n(R)\} \\
&\ge \sum_{Q_X \in \mathcal{T}_n(R)} (n+1)^{-|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\{-nD(Q_X\|P_X)\} && (14) \\
&\ge (n+1)^{-|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\left\{-n \min_{Q_X \in \mathcal{T}_n(R)} D(Q_X\|P_X)\right\} \\
&= \exp\left\{-n\left(\epsilon_n(1) + \min_{Q_X \in \mathcal{T}_n(R)} D(Q_X\|P_X)\right)\right\},
\end{aligned}
\]

where Eq. (14) comes from Lemma 3. This completes the proof of Theorem 5. ■

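The following converse theorem indicates that the exponent of correct decoding obtained in Theorem 5 might not be tight.
Theorem 6: Any FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ must satisfy

\[ 1 - \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} \le \exp\left[-n\left\{-\epsilon_n(1) + \min_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+ + D(Q_X\|P_X)\right)\right\}\right] \quad (15) \]

for any integer $n \ge 1$, any source $X$ and a given coding rate $R = (1/n) \log M_n > 0$, where

\[ Q_X = Q_j V_j, \quad \forall j \in \mathcal{I}_{N_d} \]

and $|a|^+ = \max\{a, 0\}$.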

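Proof: Note that the number of sequences to be decoded correctly for each decoder is at most $\exp(nR)$. Here, let us
consider $Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})$, $Q_j$ and $V_j$ that satisfy Eq. (15). The ratio $r_c(Q_X)$ of sequences in the sequence set $T_{Q_X}^n$ that
are correctly reproduced is at most

\[
\begin{aligned}
r_c(Q_X) &\le \min\left\{\frac{\exp(nR)}{\max_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})|}, 1\right\} \\
&\le \min\left\{\exp(nR) \cdot (n+1)^{|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\{-n \max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j)\}, 1\right\} && (16) \\
&= \min\left\{\exp\{-n(\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1)))\}, 1\right\} \\
&= \exp\left\{-n\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+\right\},
\end{aligned}
\]

where Eq. (16) comes from Lemma 2. Therefore, the probability $P_c(Q_X)$ such that the original sequence set with type $Q_X$
is correctly reproduced is bounded as

\[
\begin{aligned}
P_c(Q_X) &\le r_c(Q_X) \Pr\{X^n \in T_{Q_X}^n\} \\
&\le \exp\left\{-n\left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+ + D(Q_X\|P_X)\right)\right\}, && (17)
\end{aligned}
\]

where Eq. (17) comes from Lemma 3. Thus, the sum of the probabilities of correct decoding is obtained as

\[
\begin{aligned}
1 - \sum_{j \in \mathcal{I}_{N_d}} e_n^{(j)} &\le \sum_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} P_c(Q_X) \\
&\le \sum_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \exp\left\{-n\left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+ + D(Q_X\|P_X)\right)\right\} \\
&\le (n+1)^{|\mathcal{X}^{(\mathcal{I}_{N_s})}|} \exp\left\{-n \min_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+ + D(Q_X\|P_X)\right)\right\} && (18) \\
&= \exp\left[-n\left\{-\epsilon_n(1) + \min_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - (R + \epsilon_n(1))\right|^+ + D(Q_X\|P_X)\right)\right\}\right],
\end{aligned}
\]

where Eq. (18) comes from Lemma 1. This completes the proof of Theorem 6. ■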
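We can see that for any real value $R > R_f(X|S)$ and sufficiently large $n$ we have

\[ \min_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - R\right|^+ + D(Q_X\|P_X)\right) = 0. \]

On the other hand, for any real value $R < R_f(X|S)$ we have

\[ \min_{Q_X \in \mathcal{T}_n(R)} D(Q_X\|P_X) \ge \min_{Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})} \left(\left|\max_{j \in \mathcal{I}_{N_d}} H(V_j|Q_j) - R\right|^+ + D(Q_X\|P_X)\right) > 0. \]

This implies that the exponent of correct decoding obtained in Theorem 5 might not be tight.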

Remark 2: The proof of the achievability part in the paper by Willems et al. [12] implies that any (universal) Slepian-Wolf
code can be directly utilized as a (universal) FF-GCD code. Namely, the Slepian-Wolf code is achievable as an FF-GCD code
if its coding rate satisfies $R > R_f(X|S)$. However, such coding schemes cannot attain the optimal error exponent shown in
Theorem 4 since no existing construction of universal Slepian-Wolf codes attains the optimal error exponent. On the
other hand, the coding scheme presented in Section IV can attain the optimal error exponent as shown in Theorem 3.



B. Some special cases 

Here, let us consider a special case where the number of decoders equals $N_d = 2$. One of the most representative examples
is the (original) complementary delivery network, where $N_s = N_d = 2$, $S_1 = \{1\}$ and $S_2 = \{2\}$. We have proposed a universal
coding scheme for the complementary delivery network [26], [27], where we utilized a bipartite graph as a codebook. The
remainder of this subsection discusses the relationships between the previous coding scheme and the new coding scheme shown
in Section IV.

With $N_d = 2$, the coding graph $G(Q)$ can be translated into an equivalent bipartite graph (denoted by $\bar{G}(Q)$) such that

• each vertex in one set corresponds to a sequence $x^{(S_1^c)} \in T_{Q_1}^n$, and each vertex in the other set corresponds to a sequence
$x^{(S_2^c)} \in T_{Q_2}^n$.

• each edge corresponds to a sequence set $x^{(\mathcal{I}_{N_s})} \in T_Q^n$, and the edge links two vertices, each of which corresponds
to the sequence subset $x^{(S_j^c)} \in T_{Q_j}^n$ ($j = 1, 2$) of the sequence set $x^{(\mathcal{I}_{N_s})}$.

Fig. 11 shows an example of bipartite graphs equivalent to the coding graph shown in Fig. 10.
From the nature of the equivalent bipartite graph $\bar{G}(Q)$, we can easily obtain

\[ \chi(G(Q)) = \chi'(\bar{G}(Q)). \]

Therefore, the coding rate of the proposed coding scheme is determined by the edge chromatic number $\chi'(\bar{G}(Q))$ of the equivalent
bipartite graph $\bar{G}(Q)$. To this end, we introduce the following lemmas.

Lemma 9: If the number of decoders equals $N_d = 2$, then the degree of the bipartite graph $\bar{G}(Q)$ equivalent to the coding
graph $G(Q)$ is constant for a given joint type $Q \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})$ and obtained as follows:

\[ \Delta(\bar{G}(Q)) = \max_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})|, \]

where $x^{(S_j^c)} \in T_{Q_j}^n$. This equals the clique number $\omega(G(Q))$ of the coding graph $G(Q)$.

Proof: We can easily obtain this lemma from the fact that the number of edges connected to the node $x^{(S_j^c)}$ equals
$|T_{V_j}^n(x^{(S_j^c)})|$. ■

Fig. 10. Example of the coding graph when $N_d = 2$, where each vertex with a gray center corresponds to another vertex with a gray verge. For example,
the vertex $(x_3, y_4)$ exists at the top left and the bottom right.

Fig. 11. Bipartite graph equivalent to the coding graph shown in Fig. 10.

Lemma 10: If the number of decoders equals $N_d = 2$, then for a given joint type $Q \in \mathcal{T}_n(R)$ the edge chromatic number
of the bipartite graph $\bar{G}(Q)$ equivalent to the coding graph $G(Q)$ is bounded as

\[ \chi'(\bar{G}(Q)) \le \exp(nR). \]

Proof: This property is directly derived from Lemmas 2, 6 and 9 as follows:

\[
\begin{aligned}
\chi'(\bar{G}(Q)) &= \Delta(\bar{G}(Q)) && (19) \\
&= \max_{j \in \mathcal{I}_{N_d}} |T_{V_j}^n(x^{(S_j^c)})| && (20) \\
&\le \max_{j \in \mathcal{I}_{N_d}} \exp\{nH(V_j|Q_j)\} && (21) \\
&\le \exp(nR), && (22)
\end{aligned}
\]

where Eq. (19) comes from Lemma 6, Eq. (20) from Lemma 9, Eq. (21) from Lemma 2 and Eq. (22) from the definition of
$\mathcal{T}_n(R)$. This concludes the proof of Lemma 10. ■

To summarize the above discussions, we obtain

\[ \chi(G(Q)) = \chi'(\bar{G}(Q)) = \omega(G(Q)) \le \exp(nR). \]

From the above discussions, we can obtain the following direct theorems for the universal FF-GCD codes of $N_d = 2$, which
cannot be derived as corollaries of the theorems shown in the previous section.

Theorem 7: If the number of decoders equals $N_d = 2$, then for a given real number $R > 0$ there exists a universal FF-GCD
code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \hat{\varphi}_n^{(2)})\}_{n=1}^{\infty} \]

for the network $S$ such that for any integer $n \ge 1$ and any source $X$

\[ \frac{1}{n} \log M_n \le R + \epsilon_n(1), \]

\[ e_n^{(1)} + e_n^{(2)} \le \exp\left\{-n\left(-\epsilon_n(2) + \min_{Q_X \in \mathcal{T}_n^c(R)} D(Q_X\|P_X)\right)\right\}. \]

Theorem 8: For a given real number $R > 0$, there exists a universal FF-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \hat{\varphi}_n^{(2)})\}_{n=1}^{\infty} \]

for the network $S$ such that for any integer $n \ge 1$ and any source $X$

\[ \frac{1}{n} \log M_n \le R + \epsilon_n(1), \]

\[ 1 - (e_n^{(1)} + e_n^{(2)}) \ge \exp\left\{-n\left(\epsilon_n(1) + \min_{Q_X \in \mathcal{T}_n(R)} D(Q_X\|P_X)\right)\right\}. \]

The previous universal coding scheme for the original complementary delivery network utilized a bipartite graph as a
codebook, and derived coding theorems that were special cases of Theorems 7 and 8.

VI. Variable-length coding 

This section discusses variable-length coding for the generalized complementary delivery network, and shows an explicit
construction of universal variable-length codes. The coding scheme is similar to that of fixed-length codes, and also utilizes
the coding graphs defined in Section IV.



A. Formulation 

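Definition 4: (Fixed-to-variable generalized complementary delivery (FV-GCD) code)
A sequence

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

of codes

\[ (\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)}) \]

is an FV-GCD code for the network $S$ if

\[ \varphi_n : \mathcal{X}^{(\mathcal{I}_{N_s})n} \to \mathcal{B}^*, \qquad \hat{\varphi}_n^{(j)} : \varphi_n(\mathcal{X}^{(\mathcal{I}_{N_s})n}) \times \mathcal{X}^{(S_j^c)n} \to \mathcal{X}^{(S_j)n}, \]

\[ \Pr\{X^{(S_j)n} \neq \hat{X}^{(S_j)n}\} = 0 \quad \forall j \in \mathcal{I}_{N_d}, \]

where

\[ \hat{X}^{(S_j)n} \stackrel{\text{def.}}{=} \hat{\varphi}_n^{(j)}(\varphi_n(X^n), X^{(S_j^c)n}), \]

and the image of $\varphi_n$ is a prefix set.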



IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. XXX, NO. XXX, XXXXX 200X 






Definition 5: (FV-GCD achievable rate)
$R$ is an FV-GCD achievable rate of the source $X$ for the network $S$ if and only if there exists an FV-GCD code

\[ \{(\varphi_n, \hat{\varphi}_n^{(1)}, \cdots, \hat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]

for the network $S$ that satisfies

\[ \limsup_{n \to \infty} \frac{1}{n} E[l(\varphi_n(X^n))] \le R, \]

where $l : \mathcal{B}^* \to \{1, 2, 3, \cdots\}$ is a length function.
Definition 6: (Inf FV-GCD achievable rate)

\[ R_v(X|S) = \inf\{R \mid R \text{ is an FV-GCD achievable rate of } X \text{ for } S\}. \]



B. Code construction 

We construct universal FV-GCD codes (variable-length codes) in a similar manner to universal FF-GCD codes (fixed-length 
codes). Note that the coding rate depends on the type of sequence set to be encoded when constructing variable-length codes, 
whereas the coding rate is fixed beforehand for fixed-length coding. The coding scheme is as follows: 
[Encoding]

1) Create a coding graph for each joint type $Q_X \in \mathcal{P}_n(\mathcal{X}^{(\mathcal{I}_{N_s})})$ and assign a symbol to each vertex of the coding
graph $G(Q_X)$ in the same way as Steps 2 and 3 of Section IV. Note that a coding graph is created for every type.

2) For an input sequence set $x^{(\mathcal{I}_{N_s})n} \in T^n_{Q_X}$, the index assigned to the joint type $Q_X$ is the first part of the codeword, and
the symbol assigned to the corresponding vertex of the coding graph is the second part of the codeword.
Note that a codeword is assigned to every input sequence set $x^{(\mathcal{I}_{N_s})n} \in \mathcal{X}^{(\mathcal{I}_{N_s})n}$, and the codeword length depends on
the type of the input sequence set.

[Decoding] 

Decoding can be accomplished in almost the same way as the fixed-length coding. Note that the decoder can always find the 
coding table used in the encoding scheme, and therefore it can always reconstruct the original sequence. 
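
As a rough illustration of the two-part codeword, the following Python sketch treats a toy case with two binary sources and two decoders, each holding one sequence and wanting the other. The conflict rule (two sequence pairs are adjacent when some decoder sees the same side information for both), the greedy colouring, and all function names are assumptions made for this sketch; the coding graph of Section IV may differ in detail.

```python
from itertools import product
from collections import Counter

def joint_type(x1, x2):
    """Empirical joint type of a pair of equal-length sequences (hashable form)."""
    return tuple(sorted(Counter(zip(x1, x2)).items()))

def build_codebook(n, alphabet=(0, 1)):
    """Group sequence pairs by joint type and greedily colour each type class."""
    classes = {}
    for x1 in product(alphabet, repeat=n):
        for x2 in product(alphabet, repeat=n):
            classes.setdefault(joint_type(x1, x2), []).append((x1, x2))
    type_index = {t: i for i, t in enumerate(sorted(classes))}
    colour = {}
    for members in classes.values():
        for v in members:
            # Assumed conflict rule: u and v clash if a decoder has the same
            # side information for both (same x1 or same x2).
            used = {colour[u] for u in members
                    if u in colour and (u[0] == v[0] or u[1] == v[1])}
            c = 0
            while c in used:
                c += 1
            colour[v] = c
    return type_index, colour

def encode(x1, x2, type_index, colour):
    """Codeword = (index of the joint type, colour of the vertex)."""
    return type_index[joint_type(x1, x2)], colour[(x1, x2)]

def decode(codeword, side, which, type_index, colour):
    """Decoder `which` (0 or 1) holds `side` and recovers the other sequence."""
    t_idx, c = codeword
    matches = [v for v, col in colour.items()
               if col == c and v[which] == side
               and type_index[joint_type(*v)] == t_idx]
    assert len(matches) == 1  # guaranteed by the proper colouring
    return matches[0][1 - which]
```

In this sketch the codeword is the pair (type index, colour). The number of colours used within a type class determines how many bits the second part needs, which is why the codeword length varies with the type.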



C. Coding theorems 

We begin by showing a coding theorem for (non-universal) variable-length coding, which indicates that the minimum 
achievable rate of variable-length coding is the same as that of fixed-length coding. 
Theorem 9: (Coding theorem of FV-GCD code)
\[ R_v(X|S) = R_f(X|S) = \max_{j \in \mathcal{I}_{N_d}} H\bigl(X^{(S_j)} \big| X^{(S_j^c)}\bigr). \]

[Direct part]

We can apply an achievable FF-GCD code (fixed-length code) when creating an FV-GCD code. The encoder $\varphi_n$ assigns the
same codeword as that of the fixed-length code to a sequence set $x^{(\mathcal{I}_{N_s})n} \in \mathcal{X}^{(\mathcal{I}_{N_s})n}$ if the fixed-length code can correctly
reproduce the sequence set. Otherwise, the encoder sends the sequence set itself as a codeword.

The above FV-GCD code can always reproduce the original sequence set at every decoder, and it attains the desired coding 
rate. 
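
The direct part can be pictured with a small Python sketch. The one-bit flag that distinguishes the two cases is an assumption added here so that the resulting code is visibly prefix-free; the toy fixed-length code and the names `make_fv_encoder`, `ff_ok`, and `raw_bits` are likewise hypothetical.

```python
from itertools import product

def make_fv_encoder(ff_encode, ff_ok, raw_bits):
    """Wrap a fixed-length code into a variable-length one using a 1-bit flag."""
    def fv_encode(x):
        if ff_ok(x):                  # the fixed-length code reproduces x correctly
            return "0" + ff_encode(x)
        return "1" + raw_bits(x)      # otherwise send the sequence itself
    return fv_encode

# Hypothetical toy fixed-length code: it only indexes binary sequences
# of length n = 4 whose Hamming weight is at most 1 (five sequences, 3 bits).
n = 4
covered = [x for x in product((0, 1), repeat=n) if sum(x) <= 1]
index = {x: format(i, "03b") for i, x in enumerate(covered)}

fv = make_fv_encoder(index.__getitem__, index.__contains__,
                     lambda x: "".join(map(str, x)))
codewords = [fv(x) for x in product((0, 1), repeat=n)]
```

Sequences covered by the fixed-length code cost one extra bit, and the remaining ones cost n + 1 bits here. The second case contributes little to the average rate when the error probability of the fixed-length code is small.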

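[Converse part]
Let an FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S be given that satisfies the conditions of Definitions 4 and 5. From Definition 5, for any $\delta > 0$ there exists
an integer $n_1 = n_1(\delta)$ such that for all $n \ge n_1(\delta)$, we can obtain
\[ \frac{1}{n} E\bigl[l(\varphi_n(X^n))\bigr] \le R + \delta. \tag{23} \]

Here, let us define $A_n = \varphi_n(X^n)$. Since the decoder $\widehat{\varphi}_n^{(j)}$ ($j = 1, 2, \cdots, N_d$) can always reproduce the original sequence set
$X^{(S_j)n}$ from the received codeword $A_n$ and side information $X^{(S_j^c)n}$, we can see that
\[ H\bigl(X^{(S_j)n} \big| A_n X^{(S_j^c)n}\bigr) = 0, \quad \forall j \in \mathcal{I}_{N_d}. \tag{24} \]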
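Substituting $A_n$ into Eq. (23), we have
\[ n(R + \delta) \ge E[l(A_n)] \ge H(A_n) \tag{25} \]
\[ \ge H\bigl(X^{(S_j)n} \big| X^{(S_j^c)n}\bigr), \tag{26} \]
where Eq. (25) comes from the fact that $A_n$ takes values in a prefix set, and Eq. (26) from Eq. (24), since
$H(A_n) \ge H(A_n | X^{(S_j^c)n}) \ge I(A_n; X^{(S_j)n} | X^{(S_j^c)n}) = H(X^{(S_j)n} | X^{(S_j^c)n})$. Since we can select an arbitrarily
small $\delta > 0$ for a sufficiently large n, we can obtain
\[ R \ge \frac{1}{n} H\bigl(X^{(S_j)n} \big| X^{(S_j^c)n}\bigr) = H\bigl(X^{(S_j)} \big| X^{(S_j^c)}\bigr). \]

Since the above inequality is satisfied for all $j \in \mathcal{I}_{N_d}$, we obtain
\[ R \ge \max_{j \in \mathcal{I}_{N_d}} H\bigl(X^{(S_j)} \big| X^{(S_j^c)}\bigr). \]

This completes the proof of Theorem 9. ■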

The following direct theorem for universal coding indicates that the coding scheme presented in the previous subsection can 
achieve the inf achievable rate. 

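Theorem 10: There exists a universal FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S such that for any integer $n \ge 1$ and any source X, the overflow probability $\overline{\rho}_n(R)$, namely the probability
that the codeword length per message sample exceeds a given real number $R > 0$, is bounded as
\[ \overline{\rho}_n(R) \stackrel{\mathrm{def}}{=} \Pr\bigl\{ l(\varphi_n(X^n)) > nR \bigr\} \le \exp\Bigl\{ -n \Bigl( -\varepsilon_n(N_d) + \min D(Q_X \| P_X) \Bigr) \Bigr\}. \]
This implies that there exists a universal FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S that satisfies
\[ \limsup_{n \to \infty} \frac{1}{n} l(\varphi_n(X^n)) \le R_v(X|S) \quad \text{a.s.} \tag{27} \]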

Proof: The overflow probability can be obtained in the same way as the upper bound of the error probability of the FF-GCD
code, which has been shown in the proof of Theorem 3. Thus, we have
\[ \sum_{n=1}^{\infty} \Pr\Bigl\{ \frac{1}{n} l(\varphi_n(X^n)) > R_v(X|S) + \delta \Bigr\} < \infty \]
for a given $\delta > 0$. From Borel-Cantelli's lemma [28, Lemma 4.6.3], we immediately obtain Eq. (27). This completes the proof
of Theorem 10. ■

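The converse theorem for variable-length coding can be easily obtained in the same way as Theorem 4.
Theorem 11: Any FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S must satisfy
\[ \overline{\rho}_n(R) \ge \exp\Bigl\{ -n \Bigl( \varepsilon_n(2) + \min D(Q_X \| P_X) \Bigr) \Bigr\} \]
for a given real number $R > 0$ and any integer $n \ge 1$.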



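The following corollary is directly derived from Theorems 10 and 11.
Corollary 2: There exists a universal FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S such that for any source X
\[ \limsup_{n \to \infty} \frac{1}{n} l(\varphi_n(X^n)) \le R_v(X|S) \quad \text{a.s.}, \]
\[ \lim_{n \to \infty} -\frac{1}{n} \log \overline{\rho}_n(R) = \min D(Q_X \| P_X). \]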

Next, we investigate the underflow probability, namely the probability that the codeword length per message sample falls
below a given real number R > 0. For this purpose, we present the following two theorems. The proofs are almost the same
as those of Theorems 5 and 6.

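Theorem 12: There exists a universal FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S such that for any integer $n \ge 1$ and any source X, the underflow probability $\underline{\rho}_n(R)$ is bounded as
\[ \underline{\rho}_n(R) \stackrel{\mathrm{def}}{=} \Pr\bigl\{ l(\varphi_n(X^n)) < nR \bigr\} \le \exp\Bigl\{ -n \Bigl( \varepsilon_n(1) + \min D(Q_X \| P_X) \Bigr) \Bigr\}. \]
This implies that there exists a universal FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S that satisfies
\[ \liminf_{n \to \infty} \frac{1}{n} l(\varphi_n(X^n)) \ge R_v(X|S) \quad \text{a.s.} \]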



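Theorem 13: Any FV-GCD code
\[ \{(\varphi_n, \widehat{\varphi}_n^{(1)}, \cdots, \widehat{\varphi}_n^{(N_d)})\}_{n=1}^{\infty} \]
for the network S must satisfy
\[ \underline{\rho}_n(R) \le \exp\Bigl\{ -n \Bigl[ -\varepsilon_n(1) + \min \Bigl\{ \Bigl| \max_{j \in \mathcal{I}_{N_d}} H(V_j | U_j) - (R + \varepsilon_n(1)) \Bigr|^{+} + D(Q_X \| P_X) \Bigr\} \Bigr] \Bigr\} \]
for a given real number $R > 0$ and any integer $n \ge 1$.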

VII. Concluding remarks 

This paper dealt with a universal coding problem for a multiterminal source network called the generalized complementary 
delivery network. First, we presented an explicit construction of universal fixed-length codes, where a codebook can be expressed 
as a graph and the encoding scheme is equivalent to vertex coloring of the graph. We showed that the error exponent achieved 
with the proposed coding scheme is asymptotically optimal. Next, we applied the proposed coding scheme to the construction 
of universal variable-length codes. We showed that there exists a universal code such that the codeword length converges to 
the minimum achievable rate almost surely. 

Two important problems remain to be solved. First, the proposed coding scheme is impractical owing to the difficulty of
finding codewords from the coding table and the substantial amount of storage space needed for the coding table. Second, this
paper dealt only with lossless coding, and therefore the construction of universal lossy codes remains an open problem.
We have investigated the above-mentioned problems for the (original) complementary delivery network, and proposed simple
coding schemes for both lossless and lossy coding [20]. However, these coding schemes cannot be directly extended to the
generalized complementary delivery network. Practical coding schemes for the generalized complementary delivery network
should be addressed.